Results 1 - 15 of 15
1.
Bioinformatics ; 39(9), 2023 09 02.
Article in English | MEDLINE | ID: mdl-37725353

ABSTRACT

MOTIVATION: In the current Big Data era of biomedicine, there is an unmet need to systematically assess experimental observations in the context of available information. This assessment would offer a means for a comprehensive and robust validation of biomedical data results and provide an initial estimate of the potential novelty of the findings. RESULTS: Here we present BQsupports, a web-based tool built upon the Bioteque biomedical descriptors that systematically analyzes and quantifies the current support for a given set of observations. The tool relies on over 1000 distinct types of biomedical descriptors, covering over 11 different biological and chemical entities, including genes, cell lines, diseases, and small molecules. By exploring hundreds of descriptors, BQsupports provides support scores for each observation across a wide variety of biomedical contexts. These scores are then aggregated to summarize the biomedical support of the assessed dataset as a whole. Finally, BQsupports also suggests predictive features of the given dataset, which can be exploited in downstream machine learning applications. AVAILABILITY AND IMPLEMENTATION: The web application and underlying data are available online (https://bqsupports.irbbarcelona.org).


Subjects
Machine Learning, Software, Big Data
2.
Nat Commun ; 13(1): 5304, 2022 09 09.
Article in English | MEDLINE | ID: mdl-36085310

ABSTRACT

Biomedical data is accumulating at a fast pace, and integrating it into a unified framework, so that multiple views of a given biological event can be considered simultaneously, is a major challenge. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., 'drug treats disease', 'gene interacts with gene'). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.


Subjects
Knowledge, Automated Pattern Recognition, Knowledge Bases, Machine Learning, Proteins
3.
Cell Rep Med ; 3(1): 100492, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35106508

ABSTRACT

The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ∼400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ∼1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.


Subjects
Neoplasms/drug therapy, Polypharmacology, Algorithms, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Humans, Computer Neural Networks, Protein Kinases/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Genetic Transcription
4.
Nat Commun ; 12(1): 3932, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34168145

ABSTRACT

Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well-characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for it. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.


Subjects
Small Molecule Libraries/chemistry, Small Molecule Libraries/pharmacology, Structure-Activity Relationship, Tumor Cell Line, Pharmaceutical Databases, Preclinical Drug Evaluation/methods, Humans, Snail Family Transcription Factors/antagonists & inhibitors, Snail Family Transcription Factors/genetics, Snail Family Transcription Factors/metabolism
5.
J Chem Inf Model ; 60(12): 5730-5734, 2020 12 28.
Article in English | MEDLINE | ID: mdl-32672454

ABSTRACT

Until a vaccine becomes available, the current repertoire of drugs is our only therapeutic asset to fight the SARS-CoV-2 outbreak. Indeed, emergency clinical trials have been launched to assess the effectiveness of many marketed drugs, tackling the decrease of viral load through several mechanisms. Here, we present an online resource, based on small-molecule bioactivity signatures and natural language processing, to expand the portfolio of compounds with potential to treat COVID-19. By comparing the set of drugs reported to be potentially active against SARS-CoV-2 to a universe of 1 million bioactive molecules, we identify compounds that display analogous chemical and functional features to the current COVID-19 candidates. Searches can be filtered by level of evidence and mechanism of action, and results can be restricted to drug molecules or include the much broader space of bioactive compounds. Moreover, we allow users to contribute COVID-19 drug candidates, which are automatically incorporated into the pipeline once per day. The computational platform, as well as the source code, is available at https://sbnb.irbbarcelona.org/covid19.


Subjects
Antiviral Agents/chemistry, COVID-19 Drug Treatment, Drug Repositioning/methods, SARS-CoV-2/drug effects, Antiviral Agents/pharmacology, Computer Simulation, Drug Design, Humans, Molecular Models, Molecular Structure, Small Molecule Libraries/chemistry, Small Molecule Libraries/pharmacology
6.
Nat Biotechnol ; 38(9): 1098, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32440008

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

7.
Nat Biotechnol ; 38(9): 1087-1096, 2020 09.
Article in English | MEDLINE | ID: mdl-32440005

ABSTRACT

Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on ~800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.


Subjects
Pharmaceutical Preparations/metabolism, Small Molecule Libraries/metabolism, Biological Products/chemistry, Biological Products/metabolism, Biological Products/therapeutic use, Pharmacological Biomarkers/metabolism, Factual Databases, Drug Discovery, Drug Therapy, Humans, Pharmaceutical Preparations/chemistry, Small Molecule Libraries/chemistry, Small Molecule Libraries/therapeutic use
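
The abstract of entry 7 describes extending chemical similarity to similarity between bioactivity signatures expressed as vectors. A minimal sketch of that idea follows. It assumes each molecule is already represented by a fixed-length numeric vector and uses cosine similarity for the comparison; the abstract does not name a metric, and the molecule names, vector values, and function names below are hypothetical illustrations, not Chemical Checker data or its API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def most_similar(query, library, k=2):
    """Return the k (name, score) pairs with the highest similarity to query."""
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical 4-dimensional signatures (illustrative values only).
library = {
    "mol_a": [0.9, 0.1, 0.0, 0.2],
    "mol_b": [0.0, 0.8, 0.7, 0.1],
    "mol_c": [0.8, 0.2, 0.1, 0.3],
}
query = [1.0, 0.0, 0.0, 0.2]
hits = most_similar(query, library, k=2)
```

Because the comparison only needs vectors, the same function could in principle rank molecules by any signature type (targets, cell-level readouts, clinical outcomes), provided both molecules have a signature of that type.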
8.
Proteins ; 87(12): 1378-1387, 2019 12.
Article in English | MEDLINE | ID: mdl-31571280

ABSTRACT

Critical blind assessment of structure prediction techniques is crucial for the scientific community to establish the state of the art, identify bottlenecks, and guide future developments. In Critical Assessment of Techniques in Structure Prediction (CASP), human experts assess the performance of participating methods in relation to the difficulty of the prediction task in a biennial experiment on approximately 100 targets. Yet, the development of automated computational modeling methods requires more frequent evaluation cycles and larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements CASP by conducting fully automated blind prediction evaluations based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). Each week, CAMEO publishes benchmarking results for predictions corresponding to a set of about 20 targets collected during a 4-day prediction window. CAMEO benchmarking data are generated consistently for all methods at the same point in time, enabling developers to cross-validate their method's performance and to refer to their results in publications. Many successful participants of CASP have used CAMEO, either by directly benchmarking their methods within the system or by comparing their own performance to CAMEO reference data. CAMEO offers a variety of scores reflecting different aspects of structure modeling, for example, binding site accuracy, homo-oligomer interface quality, or accuracy of local model confidence estimates. By introducing the "bestSingleTemplate" method based on structure superpositions as a reference for the accuracy of 3D modeling predictions, CAMEO facilitates objective comparison of techniques and fosters the development of advanced methods.


Subjects
Computational Biology, Protein Conformation, Proteins/ultrastructure, Software, Algorithms, Benchmarking, Binding Sites, Protein Databases, Humans, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/genetics, Protein Sequence Analysis
9.
Methods Mol Biol ; 1851: 301-316, 2019.
Article in English | MEDLINE | ID: mdl-30298405

ABSTRACT

Proteins are subject to evolutionary forces that shape their three-dimensional structure to meet specific functional demands. The knowledge of the structure of a protein is therefore instrumental to gain information about the molecular basis of its function. However, experimental structure determination is inherently time-consuming and expensive, making it impossible to follow the explosion of sequence data deriving from genome-scale projects. As a consequence, computational structural modeling techniques have received much attention and established themselves as a valuable complement to experimental structural biology efforts. Among these, comparative modeling remains the method of choice to model the three-dimensional structure of a protein when homology to a protein of known structure can be detected. The general strategy consists of using experimentally determined structures of proteins as templates for the generation of three-dimensional models of related family members (targets) of which the structure is unknown. This chapter provides a description of the individual steps needed to obtain a comparative model using SWISS-MODEL, one of the most widely used automated servers for protein structure homology modeling.


Subjects
Proteins/chemistry, Computational Biology, Molecular Models, Proteins/classification, Amino Acid Sequence Homology, Protein Structural Homology
10.
J Mol Biol ; 430(21): 4431-4438, 2018 10 19.
Article in English | MEDLINE | ID: mdl-30274705

ABSTRACT

Multi-protein machines are responsible for most cellular tasks, and many efforts have been invested in the systematic identification and characterization of thousands of these macromolecular assemblies. Unfortunately, the (quasi) atomic details necessary to understand their function are available only for a tiny fraction of the known complexes. The computational biology community is developing strategies to integrate structural data of different nature, from electron microscopy to X-ray crystallography, to model large molecular machines, as has been done for individual proteins and interactions with remarkable success. However, unlike for binary interactions, there is no reliable gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations. Here, we present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes with three or more components. By changing the values of sequence identity and component overlap between assemblies required to define complex redundancy, we can create sets of representative complexes with known 3D structure (i.e., target complexes). Using an identity threshold of 20% and imposing a fraction of component overlap of <0.5, we identify 495 unique target complexes, which represent a real non-redundant set of heteromeric assemblies with known 3D structure. Moreover, for each target complex, we also identify a set of assemblies, of varying degrees of identity and component overlap, that can be readily used as input in a complex modeling exercise (i.e., template subcomplexes). We hope that resources like this will significantly help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did. The interactive resource is accessible at https://DynBench3D.irbbarcelona.org.


Subjects
Computational Biology/methods, Multiprotein Complexes/chemistry, Benchmarking, X-Ray Crystallography, Protein Databases, Internet, Electron Microscopy, Molecular Models, Molecular Weight, Software
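
The redundancy filter described in the abstract of entry 10 (a sequence identity threshold plus a component overlap cutoff of <0.5) can be illustrated with a rough sketch. Several things here are assumptions rather than the published procedure: components are taken to be already grouped into sequence clusters at the chosen identity threshold (each integer below stands for one cluster), overlap is defined as shared clusters divided by the size of the smaller complex, and selection is a simple greedy pass from largest to smallest complex. All names and data are hypothetical.

```python
def component_overlap(a, b):
    """Fraction of shared component clusters, relative to the smaller complex."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def select_nonredundant(complexes, max_overlap=0.5, min_components=3):
    """Greedily keep complexes whose overlap with every kept complex is below max_overlap.

    complexes maps a complex name to the set of sequence-cluster ids of its
    components. Complexes with fewer than min_components clusters are skipped.
    """
    kept = {}
    # Visit larger complexes first; break ties by name for reproducibility.
    ordered = sorted(complexes.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, clusters in ordered:
        if len(clusters) < min_components:
            continue
        if all(component_overlap(clusters, other) < max_overlap for other in kept.values()):
            kept[name] = clusters
    return list(kept)


# Hypothetical complexes; integers are sequence-cluster ids.
complexes = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3},  # fully contained in A, so redundant
    "C": {5, 6, 7},
    "D": {1, 8, 9},  # shares one of three clusters with A
    "E": {5, 6},     # fewer than three components
}
targets = select_nonredundant(complexes)
```

Changing `max_overlap` (or re-clustering the components at a different identity threshold before calling the function) yields a different representative set, which is the sense in which such sets can be generated dynamically.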
11.
Nucleic Acids Res ; 46(W1): W296-W303, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29788355

ABSTRACT

Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.


Subjects
Internet, Protein Conformation, Proteins/genetics, Software, Protein Databases, Chemical Models, Molecular Dynamics Simulation, Proteins/chemistry, Amino Acid Sequence Homology, Protein Structural Homology
12.
Proteins ; 86 Suppl 1: 387-398, 2018 03.
Article in English | MEDLINE | ID: mdl-29178137

ABSTRACT

Every second year, the community experiment "Critical Assessment of Techniques for Structure Prediction" (CASP) conducts an independent blind assessment of structure prediction methods, providing a framework for comparing the performance of different approaches and discussing the latest developments in the field. Yet, developers of automated computational modeling methods clearly benefit from more frequent evaluations based on larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements the CASP experiment by conducting fully automated blind prediction assessments based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). CAMEO publishes weekly benchmarking results based on models collected during a 4-day prediction window, on average assessing ca. 100 targets during a time frame of 5 weeks. CAMEO benchmarking data is generated consistently for all participating methods at the same point in time, enabling developers to benchmark and cross-validate their method's performance, and directly refer to the benchmarking results in publications. In order to facilitate server development and promote shorter release cycles, CAMEO sends weekly emails with submission statistics and low performance warnings. Many participants of CASP have successfully employed CAMEO when preparing their methods for upcoming community experiments. CAMEO offers a variety of scores to allow benchmarking diverse aspects of structure prediction methods. By introducing new scoring schemes, CAMEO facilitates new development in areas of active research, for example, modeling quaternary structure, complexes, or ligand binding sites.


Subjects
Computational Biology/methods, Molecular Models, Protein Conformation, Proteins/chemistry, Proteins/metabolism, Protein Sequence Analysis/methods, Binding Sites, Protein Databases, Humans, Ligands, Protein Binding
13.
Proteins ; 86 Suppl 1: 247-256, 2018 03.
Article in English | MEDLINE | ID: mdl-29071742

ABSTRACT

We present the results of the first independent assessment of protein assemblies in CASP. A total of 1624 oligomeric models were submitted by 108 predictor groups for the 30 oligomeric targets in the CASP12 edition. We evaluated the accuracy of oligomeric predictions by comparison to their reference structures at the interface patch and residue contact levels. We find that interface patches are more reliably predicted than the specific residue contacts. Whereas none of the 15 hard oligomeric targets have successful predictions for the residue contacts at the interface, six have models with resemblance in the interface patch. Successful predictions of interface patch and contacts exist for all targets suitable for homology modeling, with at least one group improving over the best available template for each target. However, the participation in protein assembly prediction is low and uneven. Three human groups are closely ranked at the top by overall performance, but a server outperforms all other predictors for targets suitable for homology modeling. The state of the art of protein assembly prediction methods is in development and has apparent room for improvement, especially for assemblies without templates.


Subjects
Computational Biology/methods, Protein Databases, Molecular Models, Molecular Dynamics Simulation, Protein Conformation, Proteins/chemistry, Algorithms, Humans, Protein Folding, Protein Sequence Analysis
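
The abstract of entry 13 distinguishes two levels of evaluation, the interface patch and the specific residue contacts, and reports that patches are predicted more reliably. A toy sketch can show why the two levels may disagree. It assumes a contact is a pair of residue labels (one per chain) and the patch is the set of residues taking part in any contact, scored with plain precision and recall. These are simplifying assumptions; the abstract does not give the actual CASP12 scoring formulas, and the residue labels below are invented.

```python
def precision_recall(predicted, reference):
    """Precision and recall of a predicted set against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def interface_patch(contacts):
    """Residues that take part in at least one inter-chain contact."""
    return {residue for pair in contacts for residue in pair}


# Hypothetical inter-chain contacts: (residue on chain A, residue on chain B).
reference_contacts = {("A10", "B55"), ("A11", "B56"), ("A12", "B57")}
predicted_contacts = {("A10", "B56"), ("A11", "B55"), ("A12", "B57")}

contact_scores = precision_recall(predicted_contacts, reference_contacts)
patch_scores = precision_recall(
    interface_patch(predicted_contacts), interface_patch(reference_contacts)
)
```

In this example the model places the right residues at the interface (patch precision and recall are both 1.0) but pairs most of them incorrectly, so only one of three contacts is recovered.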
14.
Sci Rep ; 7(1): 10480, 2017 09 05.
Article in English | MEDLINE | ID: mdl-28874689

ABSTRACT

Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors can improve the selection of homologous protein templates, leading to reliable models of protein complexes.


Subjects
Protein Multimerization, Protein Sequence Analysis/methods, Animals, Fructose-Bisphosphate Aldolase/chemistry, Humans, Protein Binding, Protein Conformation, Amino Acid Sequence Homology
15.
Nucleic Acids Res ; 42(Web Server issue): W252-8, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24782522

ABSTRACT

Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates and select the ones to be used for model building. In cases where multiple alternative template structures are available for a protein of interest, a user-guided template selection step allows building models in different functional states. SWISS-MODEL is available at http://swissmodel.expasy.org/.


Subjects
Molecular Models, Quaternary Protein Structure, Tertiary Protein Structure, Software, Protein Structural Homology, Molecular Evolution, Internet